CRLVisWorks

VAST 2010 Challenge
Text Records - Investigations into Arms Dealing

Authors and Affiliations:

Lei Shi, IBM Research - China, shllsh@cn.ibm.com [PRIMARY contact]

Weihong Qian, IBM Research - China, qianwh@cn.ibm.com
Furu Wei, IBM Research - China, weifuru@cn.ibm.com

Tool(s):

We use VisWorks TIARA for the text visual analytics part and VisWorks VIGOR for the network analytics part. VisWorks VIGOR is based on the VisWorks Peony visualization framework. All of these tools were developed by members of the smart visual analytics team at IBM Research - China between 2007 and 2010.

 

Video:

 

Our video (in Flash format; drag the file into a browser to display it)

 

 

ANSWERS:


MC1.1: Summarize the activities that happened in each country with respect to illegal arms deals based on a synthesis of the information from the different report types and sources.  State the situation in each country at the end of the period (i.e. the end of the information you have been given) with respect to illegal arms deals being pursued.  Present a hypothesis about the next activities you expect to take place, with respect to the people, groups, and countries.

 

A.     Summarization by Country:

 

The summary is attached at the end of the two MCs, since it combines the results of both text and network visual analytics.

 

B.     Hypothesis of the Next Activity:

 

All clues indicate that there is a group of arms dealers in Russia and Ukraine, led by Nicolai Kuryakin and followed by Mikhail Dombrovski and Arkadi Borodinski. They are arranging a meeting at the Burj hotel, Dubai, UAE, from April 15 to April 23, where they will trade with arms buyers from around the world and arrange for shipment.

 

C.    Analytics Processes:

 

The process has three steps: pre-processing the data, composing the visualization, and finally conducting visual analytics for the tasks.

 

C.1   Data Pre-Processing:

 

We start by segmenting the raw text data into snippets according to the title line of each smaller report/message/post. These titles are well formatted, so both the snippet boundaries and the exact time of each snippet can be extracted by regular expression matching.
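The segmentation step can be sketched roughly as follows. This is a minimal illustration, not the code used for the submission: the title-line format matched by `TITLE_RE` (for example "Report 2008-02-10 14:30") is a hypothetical stand-in, since the actual title conventions of the challenge data are not reproduced here, and the function name `segment` is likewise an assumption.

```python
import re
from datetime import datetime

# Hypothetical title-line format, e.g. "Report 2008-02-10 14:30".
# The real data has its own title conventions; adjust the pattern to match.
TITLE_RE = re.compile(
    r"^(?P<kind>Report|Message|Post)\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\s*$",
    re.MULTILINE,
)

def segment(raw_text):
    """Split raw text into snippets, one per matched title line."""
    matches = list(TITLE_RE.finditer(raw_text))
    snippets = []
    for i, m in enumerate(matches):
        # A snippet body runs from the end of its title to the next title.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw_text)
        snippets.append({
            "kind": m.group("kind"),
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M"),
            "text": raw_text[m.end():end].strip(),
        })
    return snippets
```

Each returned snippet carries its type, timestamp and body text, which is the minimum needed for the time-evolving view described later.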

We then extract named entities, including people, locations and organizations, with the Stanford NE Parser (http://nlp.stanford.edu/software/CRF-NER.shtml). We further map the locations to countries and use a POS parser to extract verbs from the snippets for activity summarization.

Because the data mentions quite a lot of countries, we cluster the countries over their co-occurrence graph. In Fig. 1(a), node size indicates the number of snippets in which a country occurs, and an edge indicates co-occurrence within the same snippet. To simplify the graph, we filter out all nodes with a count below 4 and all edges with a count below 2, and then remove the three bridge nodes (which appear to be intermediate countries rather than the ones requiring arms). This yields 8 groups, which we call topics, each named after its central country. Each snippet is then mapped to one topic. Some manual effort goes into further checking the topic (country) classification, about 30 minutes for one person. We also manually check the named entities (especially the people) and create a name map that translates typos and short names into standard names. This takes approximately 1 hour for one person.
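The filtering and grouping described above can be sketched as a connected-components computation on the filtered co-occurrence graph. This is an illustrative sketch under assumptions: the function name `country_groups` is hypothetical, the input is a list of per-snippet country lists, and the "central country" of a group is assumed here to be its most frequently occurring country, which the text does not state explicitly.

```python
from collections import Counter
from itertools import combinations

def country_groups(snippet_countries, min_node=4, min_edge=2, bridges=()):
    """Cluster countries via connected components of the filtered graph."""
    node_count = Counter()
    edge_count = Counter()
    for countries in snippet_countries:
        unique = sorted(set(countries))
        node_count.update(unique)                   # occurrences by snippet
        edge_count.update(combinations(unique, 2))  # co-occurrence by snippet

    # Keep nodes seen in at least min_node snippets, excluding bridge nodes.
    nodes = {c for c, n in node_count.items()
             if n >= min_node and c not in bridges}
    adj = {c: set() for c in nodes}
    for (a, b), n in edge_count.items():
        if n >= min_edge and a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)

    groups, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first traversal of one component
            c = stack.pop()
            if c in comp:
                continue
            comp.add(c)
            stack.extend(adj[c] - comp)
        seen |= comp
        # Assumption: the topic is named after its most frequent country.
        center = max(sorted(comp), key=lambda c: node_count[c])
        groups.append((center, sorted(comp)))
    return groups
```

Calling it with `bridges=("Russia", "Ukraine", "UAE")` mirrors the bridge-node removal shown in Fig. 1(b); without that argument the bridge countries tend to merge everything into one large component.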

 

 


Figure 1. Country Graphs: (a) Original Country Co-Occurrence Graph; (b) After node/edge count filtering and deleting the bridge nodes “Russia”, “Ukraine” and “UAE”

 

C.2   Visualization:

 

After the first step, we get a time-evolving document (snippet) collection with several tagged fields and a topic (country group) classification. We leverage VisWorks TIARA to visualize it.

Fig. 2 gives an overview of the visualization, where the X axis is mapped to time, the Y axis gives the number of documents in each time period, and the vertical layers correspond to topics. The keywords in each layer show the time-sensitive entities within each topic. The keyword size indicates the occurrence count. A field navigation panel on the left controls the category of the keywords shown.

 

Figure 2. Overview of the data with TIARA, with all keyword fields selected.

 

 

C.3   Visual Analytics:

 

To meet the requirements of the task, we drill down into the content of each country group with TIARA. Here we only show how we analyze the arms dealing information for Venezuela. The analyses of the other countries follow the same methodology.

We select only the Venezuela topic in TIARA, choose the time period that contains documents of this topic, and double-click the topic trend to obtain a detailed layered view by field, as shown in Fig. 3.

In the activity sub-trend, we clearly see the theme change in the order “call-send-transfer(money)-meet”. We therefore hypothesize that the local Venezuelan buyer first calls to discuss with the dealer, then sends something (maybe the arms list), transfers the money, and finally arranges to meet.

We further locate some key persons in the Venezuela arms dealing. We find that “jhon” appears throughout the timeline, so we click on this keyword and retrieve the snippets containing “jhon”, as shown in Fig. 4. Reading these snippets in time order, we learn that “jhon” is an intermediate dealer connecting the Venezuelan buyer to “jtomski”. (From the analysis of other countries, we hypothesize that “jtomski” is Mikhail Dombrovski, since he had used the email address “Joetomsk@au.ru”.)

To help summarize the situation at the end of the period, we drill into the most recent time span (from Dec. 2008 to the latest date) and click on the layer to let TIARA extract the most important sentences in the snippets, as shown in Fig. 5. The sentences are ranked by combining the weights of the entities (people/location/activity/time) within each sentence. The top five sentences are highlighted. From these, we conclude that both jhon (the intermediary) and Vwhombre (probably the local buyer) will meet jt (the arms dealer) in the UAE in late April 2009.
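The idea of ranking sentences by the weight of the entities they contain can be illustrated with a small sketch. This is not TIARA's implementation: the per-field weights in `FIELD_WEIGHTS`, the function name `rank_sentences`, and the simple case-insensitive substring matching are all assumptions made for illustration.

```python
# Hypothetical per-field weights; TIARA's actual weighting is not described here.
FIELD_WEIGHTS = {"people": 3.0, "location": 2.0, "activity": 1.5, "time": 1.0}

def rank_sentences(sentences, entities, top_k=5):
    """Rank sentences by the summed weight of the entities they mention.

    sentences: list of sentence strings
    entities:  dict mapping field name -> iterable of entity strings
    """
    scored = []
    for idx, sentence in enumerate(sentences):
        lowered = sentence.lower()
        # Crude matching: an entity counts if it occurs as a substring.
        score = sum(
            FIELD_WEIGHTS.get(field, 0.0)
            for field, names in entities.items()
            for name in names
            if name.lower() in lowered
        )
        scored.append((score, idx, sentence))
    # Highest score first; ties keep the original sentence order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [s for _, _, s in scored[:top_k]]
```

With `top_k=5` this returns the five highest-scoring sentences, analogous to the highlighted sentences in the summary panel. A real system would match entities on token boundaries rather than substrings.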

 

 

Figure 3. Detailed view of Venezuela topic (country group). Sub-layers indicate people/place/activity(verb) keywords respectively.

 

Figure 4. Drilling into the key person “jhon”: the snippets containing “jhon” and his alias “jg” are retrieved and listed in the right panel, sorted by time.

 

Figure 5. Select the most recent info of this topic and click on the layer to show sentence summarizations in the right panel.

 


MC1.2:  Illustrate the associations among the players in the arms dealing through a social network.  If there are linkages among countries, please highlight these as well in the social network.  Our analysts are interested in seeing different views of the social network that might help them in counterintelligence activities (people, places, activities, communication patterns that are key to the network).

 

For this task, we start from the player social network in Fig. 1. The method for extracting people's names is the same as in task 1.1. In this figure, the size of each person icon is mapped to the number of snippets in which the person occurs, while an edge indicates the co-occurrence of two people in the same snippet. Figure 1 shows that the player network is composed of several connected components, where “Nicolai” and “Mikhail” hold the largest component together.
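A minimal sketch of how such a player network could be assembled from per-snippet name lists is given below. The function name `player_network` and the lowercasing of names are assumptions; the alias table `NAME_MAP` is illustrative, with the single entry "jg" for "jhon" taken from the alias noted for Figure 4 of task 1.1.

```python
from collections import Counter
from itertools import combinations

# Illustrative alias table; further entries would come from manual checking.
NAME_MAP = {"jg": "jhon"}

def player_network(snippet_people, name_map=NAME_MAP):
    """Return (node_sizes, edge_weights) for the player co-occurrence network."""
    node_sizes = Counter()
    edge_weights = Counter()
    for people in snippet_people:
        # Normalize aliases, then count each person once per snippet.
        names = sorted({name_map.get(p.lower(), p.lower()) for p in people})
        node_sizes.update(names)
        edge_weights.update(combinations(names, 2))
    return node_sizes, edge_weights
```

The node counts correspond to icon sizes and the edge counts to co-occurrence strength between two players.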

 

Figure 2 places national flags over the people to indicate their country, while inter-country connections are highlighted in orange. Again “Nicolai” and “Mikhail”, this time joined by “Arkadi” and “Boonmee”, behave as international players.

 

Figure 3 further maps the graph betweenness centrality score to the icon size. Here “Nicolai”, “Mikhail”, “Arkadi”, “Saleh” and “Nahid” stand out as bridges in the network.
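For readers unfamiliar with the measure, betweenness centrality counts how many shortest paths between other pairs of nodes pass through a given node. The sketch below uses Brandes' algorithm for an unweighted, undirected graph. It is a generic illustration, not the VIGOR implementation, and the function name `betweenness` and the adjacency-dict input are assumptions.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs.

    adj: dict mapping node -> set of neighbor nodes (symmetric).
    Returns unnormalized betweenness, counting each unordered pair once.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}
```

On a toy chain such as Boonmee, Nicolai, Mikhail, Saleh (not the actual challenge graph), the two middle nodes each score 2 and the end nodes score 0, which is why bridge-like players get large icons in this view.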

 

Figure 4 introduces additional locations (such as Dubai and Burj) and organizations (such as jhangvi) into the graph. The graph then becomes almost a single connected component, with Dubai as the primary bridge.

 

Figure 5 maps the country group (topic) category (extracted in task 1.1) onto each person and highlights the 1-hop closure of “Dubai”. We find that “Dubai” connects to all 8 country groups, either directly or through one intermediate potential dealer.
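The check that a hub reaches every country group "directly or through one intermediate" amounts to collecting the topics of all nodes within two hops. A minimal sketch follows, assuming an adjacency dict `adj` and a node-to-topic mapping `topic_of`; both names and the function `groups_within_hops` are hypothetical.

```python
def groups_within_hops(adj, topic_of, source, max_hops=2):
    """Return the topics of nodes reachable from source within max_hops edges."""
    frontier, seen = {source}, {source}
    for _ in range(max_hops):
        # Expand one hop, skipping nodes that were already visited.
        frontier = {w for v in frontier for w in adj.get(v, ())} - seen
        seen |= frontier
    # Nodes without a topic (e.g. the hub itself) are ignored.
    return {topic_of[v] for v in seen if v in topic_of}
```

Comparing the returned set against the full list of topics shows which country groups, if any, are not covered by the hub's neighborhood.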

 

 

Figure 1. Player network with node size mapped to occurrence count.

 

Figure 2. Player network highlighting country info and inter-country connections.

 

Figure 3. Player network with node size mapped to graph betweenness centrality.

 

Figure 4. Player+Location+Organization network. The whole picture is almost fully interconnected by the new locations and organizations.

 

Figure 5. Synthesized network with country group category mapped to node color.

 

A.     Summarization by Country: (the result of combining text and network visual analytics)

 

A.1  Pakistan:

Activities:

• Lashkar-e-Jhangvi is suspected of having planned a bomb attack in late February 2008.

• Another plan is being formed by Lashkar-e-Jhangvi remnants near Karachi, timed for a religious day.

• Bukhari, a suspected top leader of Lashkar-e-Jhangvi, transfers money out twice: first in February 2008 to Moscow, then on November 11, 2008 to Saudi Arabia. He is frequently active from June to September 2008, including receiving boxes.

Situations:

• Bukhari will go to the Burj hotel, Dubai, on April 18, 2009.

 

 

A.2   Thailand:

Activities:

• A cargo plane carrying weapons from North Korea is seized in Bangkok on February 10, 2008. The plane belongs to Arkadi. The arms were bound for Iran.

• Boonmee is frequently in contact with Nicolai, Arkadi and Lim.

Situations:

• The arms dealer for Iran (probably Arkadi) will be in Dubai on April 21, 2009, with Nicolai.

• Boonmee will meet Nicolai Kuryakin in Dubai on April 17, 2009.

Hypothesis:

• Arkadi is a Ukrainian arms dealer focusing on Asia (Iran, Thailand), working with Nicolai.

• Boonmee is a local arms dealer in Southeast Asia. He obtains arms from Arkadi and Nicolai and sells them to Burma through Lim.

 

A.3   Israel:

Activities:

• MFJ (a terrorist group in Israel), led by Kasem, is preparing for a terrorist event and requests arms by May 2009.

• Khouri, an MFJ sympathizer and friend of Kasem, gets a contact (from the Russian Army) for ammunition. Kasem and Anka finally decide to fly to Dubai. Kasem will arrange for the money to arrive.

Situations:

• Kasem, Khouri and Anka will fly to Dubai on April 18, 2009.

 

 

A.4   Venezuela:

Activities:

• Barcelona gets the contact of an arms dealer, jtomski (probably Mikhail), from Jhon.

• On November 14, 2008, Barcelona is told to send money (transferred later) to the green man. The meeting will be in spring 2009.

• Jhon requests hundreds of arms from jtomski in October 2008.

• Vwhombre requests arms from jtomski.

 

Situations:

• Jhon will have a meeting with jtomski on April 23, 2009 at Arab Sail.

• Vwhombre will meet jtomski on April 22, 2009 at a hotel in Dubai.

 

Hypothesis:

• The Venezuelan buyer (probably “Vwhombre”) connects with Mikhail through Jhon.

• Jhon is probably a local arms dealer in South America, smuggling arms from Russia to local buyers.

 

 

A.5   Syria:

Activities:

• Baltasar, a suspected leader of a Syrian group, requests arms.

• They connect to the arms dealer through a Moscow professor. They confirm the required arms list with the professor.

Situations:

• Celik, Hakan and Kaya will travel to Dubai on April 16.

• Baltasar, Adad and Ashur will go to Dubai on April 18, staying in the same hotel close to the Burj (for the meeting).

Hypothesis:

• The Moscow professor (probably Mikhail) belongs to the Russian arms dealer group and arranges business with the Syrian buyer.

 

 

A.6   Nigeria:

Activities:

• Funsho Kapolalum (probably Dr. George’s partner) arranges with Mikhail to transfer money out of Nigeria.

• A list is sent by Dr. George, suspected to be the required arms list.

• Mikhail calls Nigeria (probably George) about their deals. They agree to meet in Dubai.

Situations:

• They will meet on April 15, 2009 in Dubai.

 

 

 

A.7   Kenya:

Activities:

• During MP training in September 2008, some arms are moved and lost.

• Arms are found in the house of Thabiti Otieno in October 2008. He is arrested together with MP officers.

• Thabiti Otieno and his wife Nahid Owiti are charged with possessing ammunition, but later acquitted.

• A cargo ship carrying arms is captured by pirates in October 2008. The owner pays a ransom in March 2009.

Situations:

• Moscow (probably Mikhail) calls Kenya (maybe Thabiti). He expects the ship’s jewels (probably arms) to arrive in Dubai on April 17, 2009. Nahid and probably Thabiti will meet Nicolai in Dubai.

Hypothesis:

• Thabiti and Nahid could be the local contacts for arms dealing in Africa (Kenya and Sudan). They also smuggle arms from the Kenyan government and help ship the arms for dealing in Dubai.

 

 

A.8   Yemen:

Activities:

• Weapons are seized in Dafa. Saleh Ahmed, the leader of the smuggling operation, flees to Yemen.

• Saleh is supplying the rebels in Yemen and Saudi Arabia.

• Saleh arranges to buy ammunition from Mikhail.

Situations:

• Saleh and Mikhail (also Nicolai) will meet at the Burj, Dubai, on April 19, 2009.

Hypothesis:

• Saleh is a local arms dealer in the Middle East (Yemen and Saudi Arabia) who smuggles arms from Russia.